RBA: An Integrated Framework for Regression based on Association Rules

Authors

  • Aysel Ozgur
  • Pang-Ning Tan
  • Vipin Kumar
Abstract

This paper explores a novel framework for building regression models using association rules. The model consists of an ordered set of IF-THEN rules, where the rule consequent is the predicted value of the target attribute. The approach consists of two steps: (1) extraction of association rules, and (2) construction of the rule-based regression model. We propose a pruning scheme for redundant and insignificant rules in the rule extraction step, as well as a number of heuristics for building regression models. This approach allows discovery of global patterns and offers resistance to noise while building relatively simple models. We perform a comparative study of the performance of RBA against CART and Cubist using 21 real-world data sets. Our experimental results suggest that RBA outperforms Cubist and is as good as CART on many data sets and, more importantly, that there are situations where RBA is significantly better than CART, especially when the number of noise dimensions in the data is large.
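
To make the model form described in the abstract concrete, the following is a minimal sketch of prediction with an ordered set of IF-THEN rules whose consequent is a numeric value. It is not the authors' implementation: the `Rule` class, the interval encoding of conditions, the first-match policy, the `default` fallback, and the example attributes and values are all assumptions made for illustration. The abstract does not specify how rules are extracted, pruned, or ordered, so none of that is shown.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Rule:
    """Hypothetical IF-THEN regression rule (illustrative encoding only)."""
    # Antecedent: attribute name -> half-open interval [low, high).
    conditions: Dict[str, Tuple[float, float]]
    # Consequent: predicted value of the target attribute.
    prediction: float

    def matches(self, record: Dict[str, float]) -> bool:
        # A rule fires only if every condition holds for the record.
        return all(
            attr in record and low <= record[attr] < high
            for attr, (low, high) in self.conditions.items()
        )


def predict(rules: List[Rule], record: Dict[str, float], default: float) -> float:
    """Return the consequent of the first matching rule (assumed policy)."""
    for rule in rules:  # rules are scanned in their given order
        if rule.matches(record):
            return rule.prediction
    return default  # assumed fallback when no rule covers the record


# Made-up rules and values, for illustration only.
rules = [
    Rule({"age": (0.0, 30.0), "income": (0.0, 40.0)}, 12.5),
    Rule({"age": (30.0, 200.0)}, 20.0),
]
```

Because the list is ordered, a record that satisfies several rules receives the prediction of the earliest one, which is why rule ordering matters in this kind of model.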

Similar resources

An Integrated DEA and Data Mining Approach for Performance Assessment

This paper presents a data envelopment analysis (DEA) model combined with Bootstrapping to assess performance of one of the Data mining Algorithms. We applied a two-step process for performance productivity analysis of insurance branches within a case study. First, using a DEA model, the study analyzes the productivity of eighteen decision-making units (DMUs). Using a Malmquist index, DEA deter...

An Integrated Human Resource Planning Framework for Project-based Organizations in Oil and Gas Industry

The complexities of the oil industry, combined with those of project-based organizations, have caused traditional HR planning to fail. The success of these organizations depends on integrative human resource planning. To this end, the purpose of this study was to determine the factors and components of human resource planning in oil and gas project-based organizations and providing an...

Retaining Customers Using Clustering and Association Rules in Insurance Industry: A Case Study

This study clusters customers and finds the characteristics of different groups in a life insurance company in order to find a way for prediction of customer behavior based on payment. The approach is to use clustering and association rules based on CRISP-DM methodology in data mining. The researcher could classify customers of each policy in three different clusters, using association rules. A...

Introducing an algorithm for use to hide sensitive association rules through perturb technique

Due to the rapid growth of data mining technology, obtaining private data on users through this technology becomes easier. Association Rules Mining is one of the data mining techniques to extract useful patterns in the form of association rules. One of the main problems in applying this technique on databases is the disclosure of sensitive data by endangering security and privacy. Hiding the as...

The Nexus among Resource Based Theory, Marketing Strategy, and Firm Performance: An Integrated Framework

The purpose of this article is to present the link among resource based theory, marketing strategy, and firms’ performance in order to propose an integrative framework showing how the three constructs are linked. It is based on a review of the academic literature on resource based theory and marketing strategy chronicled in major marketing journals up to December 2015. Besides, the paper ref...

Publication date: 2004